Feature selection algorithm for imbalanced data based on pseudo-label consistency
Yiheng LI, Chenxi DU, Yanyan YANG, Xiangyu LI
Journal of Computer Applications    2022, 42 (2): 475-484.   DOI: 10.11772/j.issn.1001-9081.2021050957

To address the problem that most granular computing algorithms ignore class imbalance in data, a feature selection algorithm integrating a pseudo-label strategy was proposed for class-imbalanced data. Firstly, to facilitate feature selection on class-imbalanced data, sample consistency and dataset consistency were redefined, and a corresponding greedy forward search algorithm for feature selection was designed. Then, the pseudo-label strategy was introduced to balance the class distribution of the data. By integrating the learned pseudo-label of each sample into the consistency measure, pseudo-label consistency was defined to evaluate the features of a class-imbalanced dataset. Finally, a Pseudo-Label Consistency based Feature Selection (PLCFS) algorithm for class-imbalanced data was developed, which selects features so that the pseudo-label consistency measure of the dataset is preserved. Experimental results indicate that PLCFS is outperformed only by the max-Relevancy and Min-Redundancy (mRMR) algorithm, and that it outperforms the Relief algorithm and the Consistency-based Feature Selection (CFS) algorithm.
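
The greedy forward search over a consistency measure can be illustrated with a minimal sketch. The abstract does not give the redefined sample consistency, dataset consistency, or the pseudo-label consistency, so the code below is not PLCFS: it assumes the classic majority-class consistency on discrete features as a stand-in, and the names `consistency` and `greedy_forward_select` are hypothetical.

```python
from collections import Counter, defaultdict


def consistency(rows, labels, features):
    """Classic consistency (an assumed stand-in, not the paper's measure):
    the fraction of samples whose label equals the majority label among
    all samples sharing the same values on `features`."""
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        key = tuple(row[f] for f in features)
        groups[key][y] += 1
    agree = sum(max(counts.values()) for counts in groups.values())
    return agree / len(labels)


def greedy_forward_select(rows, labels, measure=consistency):
    """Add one feature at a time, always the one that raises the measure
    most, until the subset reaches the measure of the full feature set."""
    n_features = len(rows[0])
    target = measure(rows, labels, list(range(n_features)))
    selected = []
    while measure(rows, labels, selected) < target:
        remaining = [f for f in range(n_features) if f not in selected]
        best = max(remaining, key=lambda f: measure(rows, labels, selected + [f]))
        selected.append(best)
    return selected


# Toy data (made up for illustration): feature 1 alone determines the label.
rows = [(0, 0, 1), (1, 0, 0), (0, 1, 1), (1, 1, 0), (0, 1, 0), (1, 0, 1)]
labels = [0, 0, 1, 1, 1, 0]
print(greedy_forward_select(rows, labels))
```

Because `measure` is a parameter, a class-aware or pseudo-label-based consistency could in principle be substituted for the stand-in without changing the search loop.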
